Discovering keywords from cross-modal input: ecological vs. engineering methods for enhancing acoustic repetitions
Abstract
This paper introduces a computational model that automatically segments acoustic speech data and builds internal representations of keyword classes from cross-modal (acoustic and pseudo-visual) input. Acoustic segmentation is achieved using a novel dynamic time warping technique, and the focus of this paper is on recent investigations conducted to enhance the identification of repeating portions of speech. This ongoing research is inspired by current cognitive views of early language acquisition and therefore strives for ecological plausibility in an attempt to build more robust speech recognition systems. Results show that an ad hoc, computationally engineered solution can aid the discovery of repeating acoustic patterns. However, we show that this improvement can be simulated in a more ecologically valid way.
Similar resources
Discovering an optimal set of minimally contrasting acoustic speech units: a point of focus for whole-word pattern matching
This paper presents a computational model that can automatically learn words, made up from emergent sub-word units, with no prior linguistic knowledge. This research is inspired by current cognitive theories of human speech perception, and therefore strives for ecological plausibility with the desire to build more robust speech recognition technology. Firstly, the particulate structure ...
Damage identification of structures using experimental modal analysis and continuous wavelet transform
Abstract: Modal analysis is a powerful technique for understanding the behavior and performance of structures. Modal analysis can be conducted via artificial excitation, e.g. shaker or instrument hammer excitation. Input force and output responses are measured. That is normally referred to as experimental modal analysis (EMA). EMA consists of three steps: data acquisition, system identificatio...
Output-only Modal Analysis of a Beam Via Frequency Domain Decomposition Method Using Noisy Data
The output data from a structure is the building block for output-only modal analysis. The structure response in the output data, however, is usually contaminated with noise. Naturally, the success of output-only methods in determining the modal parameters of a structure depends on noise level. In this paper, the possibility and accuracy of identifying the modal parameters of a simply supported...
Multimodal and Cross-modal Processing in Interactive Systems Based on Tangible Acoustic Interfaces
This paper presents some recent developments at DISTInfoMus Lab on multimodal and cross-modal processing of multimedia data streams with a particular focus on interactive systems exploiting Tangible Acoustic Interfaces (TAIs). In our research multimodal and cross-modal algorithms are employed for enhancing the extraction and analysis of the expressive information conveyed by gesture in non-verb...
Grounded spoken language acquisition: experiments in word learning
Language is grounded in sensory-motor experience. Grounding connects concepts to the physical world, enabling humans to acquire and use words and sentences in context. Currently most machines which process language are not grounded. Instead, semantic representations are abstract, pre-specified, and have meaning only when interpreted by humans. We are interested in developing computational syste...
Publication date: 2009